This notebook estimates the gamma factor from a set of 5 μs-ALEX smFRET measurements.
According to Lee 2005 (PDF, SI PDF), we estimate the $\gamma$-factor from Proximity Ratio (PR) and S values (with background, leakage and direct excitation correction) for a set of 5 μs-ALEX measurements.
The PR and S values are computed by the template notebook:
which is executed for each sample using usALEX-Batch.
From Lee 2005 (equation 20), the following linear relation holds:
$$\frac{1}{S} = \Omega + \Sigma \cdot E_{PR}$$Once $\Omega$ and $\Sigma$ are fitted, we can compute the $\gamma$-factor as (equation 22):
$$\gamma = (\Omega-1)/(\Omega + \Sigma-1)$$
In [3]:
from __future__ import division
import numpy as np
import pandas as pd
import lmfit
from scipy.stats import linregress
This notebook read data from the file:
In [6]:
data_file = 'results/usALEX-5samples-PR-leakage-dir-ex-all-ph.txt'
In [7]:
data = pd.read_csv(data_file, sep="\s+").set_index('sample')
data
Out[7]:
In [8]:
data[['E_pr_fret', 'E_pr_fret_kde']]
Out[8]:
In [9]:
res = linregress(data.E_pr_fret_kde, 1/data.S_pr_fret)
slope, intercept, r_val, p_val, stderr = res
For more info see scipy.stats.linearregress
.
In [10]:
Sigma = slope
Sigma
Out[10]:
In [11]:
Omega = intercept
Omega
Out[11]:
In [10]:
r_val
Out[10]:
Coefficient of determination $R^2$:
In [11]:
r_val**2
Out[11]:
P-value (to test the null hypothesis that the slope is zero):
In [12]:
p_val
Out[12]:
Gamma computed from the previous fitted values:
In [12]:
gamma = (Omega - 1)/(Omega + Sigma - 1)
gamma
Out[12]:
In [13]:
with open('results/usALEX - gamma factor - all-ph.txt', 'w') as f:
f.write(str(gamma))
With the knoledge of $\gamma$ we can compute the corrected FRET efficiencies.
Let define the function to apply gamma correction (from burstlib.py
):
In [14]:
def gamma_correct_E(Er, gamma):
"""Apply gamma correction to the uncorrected FRET `Er`."""
Er = np.asarray(Er)
return Er/(gamma-gamma*Er+Er)
The corrected values are:
In [15]:
E_alex = gamma_correct_E(data.E_pr_fret, gamma)
E_alex
Out[15]:
In [16]:
data.E_pr_fret
Out[16]:
With a non-linear least-squares minimization we are not limited to fit a linear relation.
Let's rewrite the relation without inverting $S_R$:
$$S_R = \frac{1}{\Omega + \Sigma \cdot E_{PR}}$$
In [17]:
import lmfit
from lmfit import minimize, Parameters, report_fit, report_ci, report_errors
In [18]:
def residual(params, Er, Sr, eps_data=None):
slope = params['slope'].value
intercept = params['intercept'].value
model = 1./(intercept + slope*Er)
if eps_data is None:
eps_data = np.ones_like(Er)
return (Sr - model)/eps_data
params = Parameters()
params.add('intercept', value=2)
params.add('slope', value=0.)
out = minimize(residual, params, args=(data.E_pr_fret, data.S_pr_fret))
In [19]:
out.values
Out[19]:
In [20]:
report_fit(params)
In [21]:
Omega = out.values['intercept']
Sigma = out.values['slope']
gamma = (Omega - 1)/(Omega + Sigma - 1)
gamma
Out[21]:
We can compute the confidence intervals (method details) and derive the effect on $\gamma$:
In [22]:
ci = lmfit.conf_interval(out)
report_ci(ci)
Let's compute gamma in the 95% confidence interval for the 2 parameters using a simple brute-force method:
In [23]:
gamma_list = []
for sig_err in np.linspace(ci['slope'][1][1], ci['slope'][5][1], 20):
for omega_err in np.linspace(ci['intercept'][1][1], ci['intercept'][5][1], 20):
Omega = out.values['intercept'] + omega_err
Sigma = out.values['slope'] + sig_err
gamma = (Omega - 1)/(Omega + Sigma - 1)
gamma_list.append(gamma)
In [24]:
np.min(gamma_list), np.max(gamma_list)
Out[24]:
NOTE: Using the 95% confidence interval for the 2 fit parameters we observe a variation on gamma ~ 12%. This only considering the empirical error of the 5 data points respect to the model. Moreover, the underlying assumption is that the errors are equally distributed for all the data points. So a more realistic estimation could be worst than that.
Just for test we can fit the same linear relation we fitted initially with linregress
:
In [25]:
def residual2(params, x, data, eps_data=None):
slope = params['slope'].value
intercept = params['intercept'].value
model = (intercept + slope*x)
if eps_data is None:
eps_data = np.ones_like(data)
return (1./data-model)/eps_data
params2 = Parameters()
params2.add('intercept', value=2)
params2.add('slope', value=0.)
out2 = minimize(residual2, params2, args=(data.E_pr_fret, data.S_pr_fret))
In [26]:
ci = lmfit.conf_interval(out2)
report_ci(ci)
In [27]:
Omega = out2.values['intercept']
Sigma = out2.values['slope']
gamma = (Omega - 1)/(Omega + Sigma - 1)
gamma
Out[27]:
By definition the stoichiometry is independent from E:
$$ S = \frac{\gamma F_D + F_A}{\gamma F_D + F_A + F_{AA}}$$Neglecting $\gamma$ we obtain a raw stoichiometry (analogous of the proxhimity ratio):
$$ S_R = \frac{F_D + F_A}{F_D + F_A + F_{AA}}$$Changes of $S_R$ as a function of $E$ indicates $\gamma \ne 1$.
We can fit $\gamma$ by minimizing the variance of $S$. However, experimentally, we have $E_R$ and $S_R$ and not $F_D$, $F_A$ and $F_{AA}$. We can express $S_R$ as a function of $E_R$:
$$ S_R = \frac{1}{1 + \frac{F_{AA}}{F_D + F_A}} = \frac{1}{1 + E_R\frac{F_{AA}}{F_A}} = \frac{1}{1 + E_R k_2}$$where $k_2 = F_{AA}/F_A$. $k_2$ as a function of $E_R$ and $S_R$ is:
$$ k_2 = \left(\frac{1}{S_R} - 1\right)/E_R$$Using this last definition, we can write $S$ as:
$$S = \frac{1}{1 + E k_2} = \frac{1}{1 + g(E_R, \gamma) k_2} = f(\gamma, E_R, S_R)$$where $g(\cdot)$ function that links $E$ and $E_R$. To find $\gamma$ we can minimize the variance of $S$ (i.e. $f(\cdot)$).
Let start defining $g(\cdot)$ and $k_2(\cdot)$:
In [28]:
def correct_E(Er, gamma):
Er = np.asanyarray(Er)
return Er/(gamma-gamma*Er+Er)
def k2(Er, Sr):
return (1./Sr - 1)/Er
A direct brute-force search gives:
In [29]:
gamma_range = np.arange(0.9, 1.1, 0.00002)
y_std = np.zeros_like(gamma_range)
for i, gamma in enumerate(gamma_range):
y = 1/(1 + correct_E(data.E_pr_fret, gamma)*k2(data.E_pr_fret, data.S_pr_fret))
y_std[i] = y.std()
#plot(y)
gamma_range[y_std == y_std.min()]
Out[29]:
Expressing the fit as a minimization problem gives exactly the same result:
In [30]:
def residual(params, Er, Sr):
gamma = params['gamma'].value
S = 1/(1+correct_E(Er, gamma)*k2(Er, Sr))
errors = S - S.mean()
return errors
params = Parameters()
params.add('gamma', value=1.0)
out = minimize(residual, params, args=(data.E_pr_fret, data.S_pr_fret))
In [31]:
out.values['gamma']
Out[31]:
In [32]:
report_fit(params)
Here we obtained that the standard deviation of the fitted $\gamma$ is around 5%, consistent with the previous estimations.
In [33]:
%matplotlib inline
from matplotlib.pyplot import *
In [34]:
plot(data.E_pr_fret, 1/data.S_pr_fret, 's')
x = np.arange(0, 1, 0.01)
plot(x, intercept + slope*x, 'k')
ylim(1, 2)
xlabel('E (uncorrected)')
ylabel('1/S (uncorrected)')
Out[34]:
In [30]:
In [30]:
In [30]: